The circular drift-diffusion model (CDDM) is a stochastic sequential sampling model that describes the choices and response times observed in tasks with a circular decision space (Smith, 2016). Like many other accumulator models, the CDDM assumes that participants accumulate information in favor of a given response option over time, until a response criterion is met. This process is characterized as a random walk that starts at the origin of a circle representing the decision space and moves toward its circumference. Once the circumference is reached, the angle (in radians) at the point of intersection and the number of steps it took to get there correspond to the observed choice and response time, respectively.
The CDDM has four parameters, most of which are illustrated in Fig 1. The parameter not shown is the nondecision time parameter \(\tau\). The remaining parameters are the boundary radius parameter (\(\eta\)) that serves as a response criterion, and a pair of parameters describing the overall speed and direction of the information accumulation process. These last two parameters can be expressed in Cartesian coordinates (i.e., the mean displacement on the X and Y axes per unit of time, \(\{\mu_x, \mu_y\}\)) or polar coordinates (i.e., the average speed and expected direction, \(\{\delta,\theta\}\)).
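Because both parameterizations appear throughout the code below, a small conversion sketch may be useful. The helper names here are ours, for illustration only; they are not part of the accompanying scripts.

```r
# Convert the drift parameters between polar and Cartesian coordinates
# (illustrative helpers; not part of the original code base)
polar.to.cartesian <- function(delta, theta){
  c("mu_x" = delta * cos(theta), "mu_y" = delta * sin(theta))
}
cartesian.to.polar <- function(mu_x, mu_y){
  c("delta" = sqrt(mu_x^2 + mu_y^2),
    "theta" = atan2(mu_y, mu_x) %% (2 * pi))
}

polar.to.cartesian(delta = 1, theta = pi)   # mu_x = -1, mu_y ~ 0
```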
Fig 1. Graphical illustration of the CDDM
In this document, we present a comparison between different sampling algorithms that generate data under the CDDM. We describe how each of these algorithms works and highlight how they differ in terms of the computation time they require and their accuracy. For starters, we will generate n = 5000 bivariate observations using the arbitrary set of parameter values shown below:
Each pair of observations is obtained by emulating the random walk process described by the CDDM. We emulate the information accumulation process by iteratively sampling random movements in the X and Y directions using \(\mu_x\) and \(\mu_y\), until the accumulated evidence reaches the circumference of the circle defined by the boundary radius \(\eta\).
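As a rough sketch of this emulation: the loop below discretizes the walk with an assumed step size `dt` and noise scale `sigma` (the actual `sample.RW.cddm` routine used in this document may differ in these choices).

```r
# Emulate one CDDM trial as a discretized 2-D random walk
# (dt and sigma are assumed values, for illustration only)
rw.sample.one <- function(mu_x, mu_y, boundary, tzero,
                          dt = 0.001, sigma = 1){
  x <- y <- 0   # start at the origin of the decision space
  t <- 0
  while (sqrt(x^2 + y^2) < boundary){       # until the circumference is hit
    x <- x + mu_x * dt + sigma * sqrt(dt) * rnorm(1)
    y <- y + mu_y * dt + sigma * sqrt(dt) * rnorm(1)
    t <- t + dt
  }
  c("choice" = atan2(y, x) %% (2 * pi),     # angle at the intersection point
    "RT"     = t + tzero)                   # decision + nondecision time
}

set.seed(42)
rw.sample.one(mu_x = -1, mu_y = 0, boundary = 7, tzero = 0.1)
```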
par(pty="m", mar = c(3, 3, 3, 0))
hist(X.RW$bivariate.data[,2], main = "Response Times", col = col.RW(0.7))

The execution of this first algorithm took approximately 15.3087 seconds.
Fig.2 Visual representation of the Rejection sampling algorithm. The blue lines represent the joint density function of response times and choices prescribed by the CDDM. The dots represent candidate pairs generated from the bivariate space of possible RTs and choices, plotted at the height of the random uniform(0, maxDensity) value used as the rejection criterion. If the rejection value is lower than the density function (i.e., the dot falls below the density curve), the candidate is accepted; if it surpasses the density (i.e., the dot is drawn above the curve), the candidate is rejected. The process of generating, testing, and keeping candidates is repeated until the desired sample size is reached.
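The logic in the figure can be written down generically. In the sketch below, `dens` stands in for the CDDM joint density used in this document, and the uniform candidate grid over choices in \([0, 2\pi)\) and RTs in \((0, \text{max.RT})\) is our assumption, not the original implementation.

```r
# Generic rejection sampler over the (choice, RT) plane
reject.sample <- function(n, dens, max.RT, max.dens){
  out <- matrix(NA, nrow = n, ncol = 2,
                dimnames = list(NULL, c("choice", "RT")))
  kept <- 0
  while (kept < n){
    candidate <- c(runif(1, 0, 2 * pi),   # candidate choice (radians)
                   runif(1, 0, max.RT))   # candidate RT
    # Accept if a uniform(0, max.dens) draw falls below the density
    if (runif(1, 0, max.dens) < dens(candidate[1], candidate[2])){
      kept <- kept + 1
      out[kept, ] <- candidate
    }
  }
  out
}

# Toy target density: choices uniform on the circle, RTs exponential
toy.dens <- function(a, t) dexp(t, rate = 2) / (2 * pi)
set.seed(1)
X.toy <- reject.sample(50, toy.dens, max.RT = 10, max.dens = 2 / (2 * pi))
```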
The execution of this second algorithm took approximately 6.9418 seconds.
The following Metropolis algorithm uses the density function to generate random observations under the CDDM. Please read the comments left throughout the code to get a better idea of how this algorithm works. To run, this algorithm only needs a list par that specifies the values to use for each of the four parameters of the model (in either of its parameterizations, polar or Cartesian coordinates) and an arbitrary value max.RT that indicates the maximum RT we would expect to observe under a realistic scenario.
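A minimal Metropolis sketch along these lines is given below. The starting point and proposal scales are assumptions made for illustration, and `dens` again stands in for the CDDM joint density; the document's own implementation may differ.

```r
# Random-walk Metropolis over (choice, RT); `dens` stands in for the
# CDDM joint density. Choice proposals wrap around the circle.
metropolis.sample <- function(n, dens, max.RT,
                              start = c(pi, 0.5), prop.sd = c(0.5, 0.1)){
  chain <- matrix(NA, nrow = n, ncol = 2,
                  dimnames = list(NULL, c("choice", "RT")))
  current <- start
  current.d <- dens(current[1], current[2])
  for (i in 1:n){
    proposal <- c((current[1] + rnorm(1, 0, prop.sd[1])) %% (2 * pi),
                  current[2] + rnorm(1, 0, prop.sd[2]))
    proposal.d <- if (proposal[2] > 0 && proposal[2] < max.RT)
                    dens(proposal[1], proposal[2]) else 0
    # Accept with probability min(1, proposal.d / current.d)
    if (runif(1) < proposal.d / current.d){
      current <- proposal
      current.d <- proposal.d
    }
    chain[i, ] <- current
  }
  chain
}
```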
The execution of this third algorithm took approximately 5.4531 seconds.
par(pty="m", mar = c(3, 3, 3, 0))
hist(X.invCDF[,2], main = "Response Times", col = col.invCDF(0.7))

The execution of this fourth algorithm took approximately 22.484 seconds.
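The inverse-CDF (inverse-transform) idea itself can be illustrated in one dimension: draw a uniform value and numerically invert the CDF at it. The sketch below uses an assumed RT marginal CDF `pRT` and inverts it with `uniroot()`; the document's bivariate implementation is more involved.

```r
# Inverse-CDF sampling for the RT dimension, given any increasing
# CDF `pRT` on (0, max.RT); uniroot() performs the numeric inversion
inv.cdf.sample <- function(n, pRT, max.RT){
  u <- runif(n)                        # uniform draws to be inverted
  sapply(u, function(ui)
    uniroot(function(t) pRT(t) - ui,   # find t such that pRT(t) = ui
            lower = 1e-9, upper = max.RT)$root)
}

set.seed(1)
rt <- inv.cdf.sample(1000, pRT = function(t) pexp(t, rate = 2), max.RT = 50)
```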
# Load file containing custom eCDF function
source("./code/general_functions/eCDF.R")
Reject.eCDF <- myECDF(X.Reject)
RW.eCDF <- myECDF(X.RW$bivariate.data)
invCDF.eCDF <- myECDF(X.invCDF)

# Approximate the theoretical CDF for the Random Walk data
start_time <- Sys.time()
RW.tCDF <- pCDDM(X.RW$bivariate.data, par$drift, par$theta, par$tzero, par$boundary)
end_time <- Sys.time()
RW_tCDF.time <- end_time-start_time
RW_tCDF.time <- round(RW_tCDF.time,4)

Obtaining the approximate CDF for the Random-Walk data with a trapezoid algorithm took 5.9703 seconds.
# Approximate the theoretical CDF for the Reject data
start_time <- Sys.time()
Reject.tCDF <- pCDDM(X.Reject, par$drift, par$theta, par$tzero, par$boundary)
end_time <- Sys.time()
MC_tCDF.time <- end_time-start_time
MC_tCDF.time <- round(MC_tCDF.time,4)

Obtaining the approximate CDF for the data generated with the Rejection sampling algorithm took 5.8577 seconds.
# Approximate the theoretical CDF for the invCDF data
start_time <- Sys.time()
invCDF.tCDF <- pCDDM(X.invCDF, par$drift, par$theta, par$tzero, par$boundary)
end_time <- Sys.time()
invCDF_tCDF.time <- end_time-start_time
invCDF_tCDF.time <- round(invCDF_tCDF.time,4)

Obtaining the approximate CDF for the data generated with the inverse-CDF sampling algorithm took 5.8341 seconds.
# We build a simple function to compare these CDFs
getDifferences <- function(eCDF, tCDF){
  difference <- tCDF - eCDF
  difference.sum <- sum(difference)
  KS_statistic <- max(abs(difference)) # Kolmogorov–Smirnov statistic
  sq.difference <- sum((difference)^2)
  output <- round(cbind(difference.sum, KS_statistic, sq.difference), 4)
  colnames(output) <- c("sumDiff", "KS-statistic", "SSDiff")
  return(output)
}

## sumDiff KS-statistic SSDiff
## [1,] 8.3044 0.0215 0.1574
## sumDiff KS-statistic SSDiff
## [1,] 0.4634 0.0156 0.1124
compareTime <- function(n.Samples, n.Datasets){
  times.RW <- matrix(NA, nrow = n.Datasets, ncol = length(n.Samples))
  times.Reject <- matrix(NA, nrow = n.Datasets, ncol = length(n.Samples))
  for(m in 1:n.Datasets){
    i <- 1
    for(n in n.Samples){
      start_time <- Sys.time()
      sample.RW.cddm(n, par)
      end_time <- Sys.time()
      times.RW[m,i] <- round(end_time - start_time, 4)
      start_time <- Sys.time()
      sample.Reject.cddm(n, par, max.RT, plot = FALSE)
      end_time <- Sys.time()
      times.Reject[m,i] <- round(end_time - start_time, 4)
      i <- i + 1
    }
  }
  return(list("times.RW" = times.RW,
              "times.Reject" = times.Reject))
}

The Reject algorithm massively outperforms the Random Walk algorithm when the parameter values used for \(\delta\) and \(\eta\) imply a slower or longer walk.
# Arbitrary set of parameter values
par <- list("drift" = 1,
            "theta" = pi,
            "tzero" = 0.1,
            "boundary" = 7)
n <- 5000 # No. samples

## sumDiff KS-statistic SSDiff
## [1,] 11.886 0.0189 0.1019
## sumDiff KS-statistic SSDiff
## [1,] -31.7521 0.0243 0.3299
## sumDiff KS-statistic SSDiff
## [1,] 1499.534 0.9991 579.5664